Sample Hierarchical Forecasting

Author

Andy Monger

Published

January 2, 2023

1 Summary

Using a sample data set from Kaggle, I’ll be analyzing sales history and developing forecasts at appropriate levels of aggregation. Models and forecasts will be generated using the fable package from Rob J Hyndman, detailed in Forecasting: Principles and Practice (3rd ed.).

I’ll be demonstrating the use of hierarchical forecasts against the sample data. Many businesses use a bottom-up approach: forecasting at the lowest level of aggregation (SKU), then summing up those forecasts to get totals for regions, categories, etc. I’ve had experience (and frustration) with this method: working through the bottom level forecasts and getting comfortable with them, only to total them up to a number that doesn’t quite make sense given total history (too low, too high, or missing seasonality).

In this document I’ll be testing different forecast reconciliation methods, and different levels of aggregation.

The sample data I used was built for practicing Tableau dashboard design, and was originally developed for analyzing profit margin, sales metrics, etc. I’ve found that sample data for forecasting tends to be very forecastable (low variability, few products, clear seasonality/trend), while actual business data tends to be much messier. I wanted a data set that was a little more painful to analyze and better represented the challenges of business forecasting.

This workbook ended up being quite long since I built it while learning and working through the forecasting problem, rather than writing it afterwards as a showcase/presentation. I did my best to keep things clean and explain the process as I went, but it really should be split into a few files: data exploration, category/sub_category forecasts, and the final forecasting at the product level.

2 Setup

First step is to load the appropriate R packages, load our data, and get our first view of what we’re working with.

2.1 Load packages

Load R packages; the code below includes a quick description of each.

# clean up column names
library(janitor)
# tidyverse for general data manipulation
library(tidyverse)
# create time series tsibbles for the fable package
library(tsibble)
# feasts includes forecast accuracy functions
library(feasts)
# easier manipulation of date/time data
library(lubridate)
# pretty display of plots (composing multiple ggplots)
library(patchwork)
# functional programming helpers
library(purrr)
# forecasting models (ETS, ARIMA) and reconciliation
library(fable)
# better view of result tables
library(knitr)
# next three create network diagrams for the product hierarchy
library(treemap)
library(data.tree)
library(networkD3)
# more forecasting tools (aggregation keys, accuracy measures)
library(fabletools)
# pretty output tables
library(kableExtra)

2.2 Load Data

Original data used can be found here. I simply downloaded it as a .xls file and loaded it from my PC to keep things simple for myself. I had to go through a few different data sets to find one that fit this project.

I loaded the file and then used the janitor package to change the column names to snake_case (my preference). First view of our data shows we have 9994 records across 21 columns. For forecasting purposes only a few of these will be useful, but we’ll keep them in for now until we can take a look at them.

There are two dates available, order_date and ship_date. I chose order_date for forecasting because it is closest to customer demand; ship_date can have many extra variables based on stock availability, warehouse issues, etc.

data <- readxl::read_xls("C:/Users/andre/Documents/r4ds/SampleSuperStore/Sample - Superstore.xls")

# clean up column names, using snake_case as default
data <- data |> janitor::clean_names()

# check column names
glimpse(data)
Rows: 9,994
Columns: 21
$ row_id        <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1…
$ order_id      <chr> "CA-2016-152156", "CA-2016-152156", "CA-2016-138688", "U…
$ order_date    <dttm> 2016-11-08, 2016-11-08, 2016-06-12, 2015-10-11, 2015-10…
$ ship_date     <dttm> 2016-11-11, 2016-11-11, 2016-06-16, 2015-10-18, 2015-10…
$ ship_mode     <chr> "Second Class", "Second Class", "Second Class", "Standar…
$ customer_id   <chr> "CG-12520", "CG-12520", "DV-13045", "SO-20335", "SO-2033…
$ customer_name <chr> "Claire Gute", "Claire Gute", "Darrin Van Huff", "Sean O…
$ segment       <chr> "Consumer", "Consumer", "Corporate", "Consumer", "Consum…
$ country       <chr> "United States", "United States", "United States", "Unit…
$ city          <chr> "Henderson", "Henderson", "Los Angeles", "Fort Lauderdal…
$ state         <chr> "Kentucky", "Kentucky", "California", "Florida", "Florid…
$ postal_code   <dbl> 42420, 42420, 90036, 33311, 33311, 90032, 90032, 90032, …
$ region        <chr> "South", "South", "West", "South", "South", "West", "Wes…
$ product_id    <chr> "FUR-BO-10001798", "FUR-CH-10000454", "OFF-LA-10000240",…
$ category      <chr> "Furniture", "Furniture", "Office Supplies", "Furniture"…
$ sub_category  <chr> "Bookcases", "Chairs", "Labels", "Tables", "Storage", "F…
$ product_name  <chr> "Bush Somerset Collection Bookcase", "Hon Deluxe Fabric …
$ sales         <dbl> 261.9600, 731.9400, 14.6200, 957.5775, 22.3680, 48.8600,…
$ quantity      <dbl> 2, 3, 2, 5, 2, 7, 4, 6, 3, 5, 9, 4, 3, 3, 5, 3, 6, 2, 2,…
$ discount      <dbl> 0.00, 0.00, 0.00, 0.45, 0.20, 0.00, 0.00, 0.20, 0.20, 0.…
$ profit        <dbl> 41.9136, 219.5820, 6.8714, -383.0310, 2.5164, 14.1694, 1…

Dates were stored as date-times instead of just dates. We don’t need the time component, so we’ll convert them to dates so they’re easier to work with. This could have been addressed as the data was loaded from Excel, but I find it easier to let the import function do its work and clean up afterwards. I also dropped row_id; it’s just an index from the base data and not meaningful.

data <- data |>
  mutate(order_date = as.Date(order_date), ship_date = as.Date(ship_date), .keep = "all") |>
  select(-row_id)

2.3 Structure of data

Running n_distinct across all columns gives a good first glance at the values, and shows which columns might be useful for creating meaningful subsections. There are few enough ship modes for that column to be usable, but the method of shipping is unlikely to have any forecasting value. Segment, region, category, and sub_category all have few enough distinct values to be useful, so we’ll explore them.

I drop the following columns:

order_id, ship_date, ship_mode, customer_id, customer_name, country, city, state, postal_code, region, sales, discount, profit

Something very concerning stands out in this view though: 1862 distinct values for product_id! That’s a lot of different items to forecast. What’s even more difficult is that there are only 14 distinct quantities listed, meaning that across four years of daily sales history, only 14 different quantities sold per item appear. We’ll look at the quantities sold later, but this may indicate low volume/intermittent demand at the product_id level.

Our data is at the daily level, though, and is supposed to be “store sales”, which may explain why we don’t see large quantities (as we would if we were forecasting shipments to distributors or vendors). In the next step we’ll aggregate it to monthly and look at the distribution of quantities.

kable(
  sapply(data, function(x) n_distinct(x)),
  col.names = "distinct values"
)
distinct values
order_id 5009
order_date 1237
ship_date 1334
ship_mode 4
customer_id 793
customer_name 793
segment 3
country 1
city 531
state 49
postal_code 631
region 4
product_id 1862
category 3
sub_category 17
product_name 1850
sales 6144
quantity 14
discount 12
profit 7545

Also a quick check of our date range: four years of demand, not too bad! This will let us use three years for training, keeping 2017 as a holdout to test our forecasting methods.

kable(range(data$order_date),
      col.names = "",
      caption = "min/max dates"
)
min/max dates
2014-01-03
2017-12-30

3 Exploratory Data Analysis

Let’s get this data summarized by month, segment, category, sub_category, and product_id. I chose month because, from my experience, it’s the most common time period for business forecasts.

After the data is summarized we still see only 19 different quantities sold per product in a given month, reinforcing the suspicion of intermittent demand at the product_id level. Let’s check our other variables before we go deeper into this issue, though.

data <- data |>
  group_by(
    month = yearmonth(order_date),
    segment,
    product_id,
    category,
    sub_category
  ) |>
  summarize(quantity = sum(quantity), .groups = "drop")


kable(
  sapply(data, function(x) n_distinct(x)),
  col.names = "distinct values"
)
distinct values
month 48
segment 3
product_id 1862
category 3
sub_category 17
quantity 19

We need to check that category/sub_category/product_id have a strictly nested relationship, meaning that each sub_category has a unique slice of the products, and each category has a unique slice of sub_categories.

By summarizing at product_id and sub_category we do get unique combinations, and filtering to product_ids and sub_categories with a count > 1 returns the empty tables below, proving a strict parent/child relationship.

That is, product_id x is only in sub_category y, and sub_category y is only in category z. This is good!

data |>
  group_by(category, sub_category, product_id) |>
  summarize(n = n(), .groups = "drop") |>
  group_by(product_id) |>
  summarise(n = n(), .groups = "drop") |>
  filter(n > 1)
# A tibble: 0 × 2
# … with 2 variables: product_id <chr>, n <int>
data |>
  group_by(category, sub_category) |>
  summarize(n = n(), .groups = "drop") |>
  group_by(sub_category) |>
  summarise(n = n(), .groups = "drop") |>
  filter(n > 1)
# A tibble: 0 × 2
# … with 2 variables: sub_category <chr>, n <int>

Another way to check this and view the mapping is with a visual network. Feel free to mouse over, click and drag! For this visual I just wanted to practice making mappings and experimenting with the chart, but it’s not really more helpful than just a table.

product_hierarchy <- data |>
  group_by(category, sub_category) |>
  summarise(quantity = sum(quantity), .groups = "drop")

# build a path string for data.tree, e.g. "Total|Furniture|Bookcases"
product_hierarchy$pathString <- paste("Total", product_hierarchy$category, product_hierarchy$sub_category, sep = "|")

hierarchy_tree <- as.Node(product_hierarchy, pathDelimiter = "|")

hierarchy_network <- ToDataFrameNetwork(hierarchy_tree, "name")

simpleNetwork(hierarchy_network, fontSize = 20)

Unlike our strict product hierarchy, however, segment is a grouping based on another variable, likely the customer and their business purpose. Each segment includes sales from all 3 categories, and based on the unique product_id count for each segment, there is clearly some overlap (remember we have 1862 product_ids).

kable(
  data |> group_by(segment, category) |>
    summarize(n = n(), .groups = "drop") |>
    group_by(segment) |>
    summarise(category_count = n(), .groups = "drop"),
  caption = "Number of categories"
)
Number of categories
segment category_count
Consumer 3
Corporate 3
Home Office 3
kable(
  data |> group_by(segment, product_id) |>
    summarize(n = n(), .groups = "drop") |>
    group_by(segment) |>
    summarise(product_id_count = n(), .groups = "drop"),
  caption = "Number of products"
)
Number of products
segment product_id_count
Consumer 1716
Corporate 1456
Home Office 1112

3.1 Shape of Demand

Now that we’ve got an understanding of the structure of the data, let’s take a look at some more interesting charts and see how our demand is shaped.

3.1.1 Product ID

Using a histogram we can see the distribution of units sold per month per product, the most common values being 2 or 3. While this on its own would be relatively challenging to forecast, another issue stands out: there are no periods with zero sales! Our data is a record of sales, and where there is no sale, of course there is no record. To forecast, however, we’ll need to fill in these zeros.

p2 <- data |>
  group_by(month, product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "steelblue4") +
  theme_minimal() +
  scale_y_continuous(breaks = seq(0, 2500, by = 500), limits = c(0, 2500)) +
  # scale_x_binned(breaks = breaks_width(1, 0), limits = c(0,NA)) +
  labs(title = "Monthly sales by product ID")

print(p2)

The tsibble package has a very useful function, fill_gaps(), to fill in implicit gaps in a time series. We could pivot the data wider by product_id, get a bunch of NA records, and replace them with zero, but this function is convenient and efficient. By default it uses the first recorded sale as the start and the last recorded sale as the end, and fills all gaps in between with zero.

Replacing the implicit gaps with zeros reveals a major challenge with our data: 84% of our records are zeros!

p2b <- data |>
  group_by(month, product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  arrange(month) |>
  as_tsibble(key = product_id, index = month) |>
  fill_gaps(quantity = 0L) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "firebrick4") +
  theme_minimal() +
  labs(title = "Monthly sales by product ID, implicit gaps removed")

p2b

kable(
  data |>
    group_by(month, product_id) |>
    summarise(quantity = sum(quantity), .groups = "drop") |>
    arrange(month) |>
    as_tsibble(key = product_id, index = month) |>
    fill_gaps(quantity = 0L) |>
    as_tibble() |> group_by(quantity) |>
    summarize(.count = n()) |>
    mutate(
      prop = round((.count / sum(.count) * 100), digits = 1)
    ) |>
    top_n(10, prop),
  caption = "Frequency of monthly quantity sold by product ID (top 10)"
)
Frequency of monthly quantity sold by product ID (top 10)
quantity .count prop
0 47190 83.5
1 784 1.4
2 2062 3.7
3 2104 3.7
4 1101 1.9
5 1172 2.1
6 597 1.1
7 588 1.0
8 281 0.5
9 262 0.5

One method of dealing with intermittent data is temporal aggregation. Instead of looking at just monthly sales, let’s look at quarterly and annual.

Aggregating to the quarterly level reduces the number of zeros from ~47K to almost 12K, and at the annual level we only have around 700.

data |>
  group_by(quarter = yearquarter(month), product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = product_id, index = quarter) |>
  fill_gaps(quantity = 0L) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "firebrick4") +
  theme_minimal() +
  labs(title = "Quarterly sales by product ID, implicit gaps removed")
data |>
  group_by(year = year(month), product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = product_id, index = year) |>
  fill_gaps(quantity = 0L) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "firebrick4") +
  theme_minimal() +
  labs(title = "Annual sales by product ID, implicit gaps removed")

Maybe some years are behaving differently? For example, as products go obsolete their future values are zero. Unfortunately we don’t have any product management data included, so we’ll have to make inferences ourselves.

Breaking it up by year, we see that in both 2014 and 2017 we had very few products with zero sales, but in 2015 and 2016 we have many. Why is this?

Our fill_gaps() call STARTS at the first record and ENDS at the last record. So, for example, if a product was sold only in 2014 and 2015, fill_gaps() won’t fill in zeros beyond the last sale in 2015.
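A minimal illustration of that behavior on toy data (not part of the analysis):

# toy tsibble with two products covering different date ranges
toy <- tsibble(
  month = yearmonth(c("2014 Jan", "2014 Mar", "2014 Feb", "2014 Jun")),
  product_id = c("A", "A", "B", "B"),
  quantity = c(2, 1, 3, 2),
  key = product_id,
  index = month
)

# default: pads A over Jan-Mar and B over Feb-Jun only
toy |> fill_gaps(quantity = 0L)

# .full = TRUE: pads both products across the full Jan-Jun span
toy |> fill_gaps(quantity = 0L, .full = TRUE)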

data |>
  group_by(quarter = yearquarter(month), product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = product_id, index = quarter) |>
  fill_gaps(quantity = 0L) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "firebrick4") +
  theme_minimal() +
  labs(title = "Quarterly sales by product ID, implicit gaps removed") +
  facet_wrap(vars(year(quarter)))
data |>
  group_by(year = year(month), product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = product_id, index = year) |>
  fill_gaps(quantity = 0L) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "firebrick4") +
  theme_minimal() +
  labs(title = "Annual sales by product ID, implicit gaps removed") +
  facet_wrap(vars(year))

While our charts above showed very few zeros at the annual level for 2017, when we use another method to generate a list of products with no sales in 2017, there are actually 337 of them. Our data is even MORE intermittent than the charts show!

Ideally we would have information on product life cycle and a process to determine when forecast history starts, when it ends, when replacement products took over, etc. But for this document I’m going to assume all products were available for the full range, and fill in zeros for any period with no sales.

At this point we have enough information to conclude that forecasting at the product_id level is not going to be pretty. If it were required by the business we could try temporal aggregation, and I come back to address that in the final section.

# kable(
#   data |> group_by(product_id, year = year(month)) |>
#     summarize(qty = sum(quantity), .groups = "drop") |>
#     group_by(year) |>
#     summarize(product_count = n(), .groups = "drop"),
#   caption = "Unique product count by year"
# )

kable(
  nrow(data |> group_by(product_id, year = year(month)) |>
    summarize(qty = sum(quantity), .groups = "drop") |>
    pivot_wider(names_from = year, values_from = qty) |>
    filter(is.na(`2017`))),
  caption = "Number of products with no sales in 2017"
)
Number of products with no sales in 2017
x
337
# kable(
#   data |>
#     group_by(product_id) |>
#     summarize(total_qty = sum(quantity), .groups = "drop") |>
#     top_n(9, total_qty) |>
#     arrange(desc(total_qty)),
#   caption = "Top 10 products, qty sold 2014-2017"
# )

Below are our charts from before with the full range filled in. No reason to drill down to month, since it will look worse than quarterly, which already looks rough.

data |>
  group_by(quarter = yearquarter(month), product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = product_id, index = quarter) |>
  fill_gaps(quantity = 0L, .full = TRUE) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "orange2") +
  theme_minimal() +
  labs(title = "Quarterly sales by product ID, implicit gaps removed full range") +
  facet_wrap(vars(year(quarter)))
data |>
  group_by(year = year(month), product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = product_id, index = year) |>
  fill_gaps(quantity = 0L, .full = TRUE) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, color = "black", fill = "orange2") +
  theme_minimal() +
  labs(title = "Annual sales by product ID, implicit gaps removed full range") +
  facet_wrap(vars(year))

One more view just for fun. This chart takes the first recorded sale of each product and counts debuts by month. It gives us an idea of when products got their first sale, but of course we’re missing all years before 2014, so this picture is incomplete.

data[match(unique(data$product_id), data$product_id), ] |>
  ggplot(aes(x = month)) +
  geom_histogram(bins = 48, color = "black", fill = "steelblue4") +
  theme_minimal() +
  labs(
    title = "First recorded sales",
    y = "Number of products"
  )

3.1.2 Segment

Forecasting a series of counts (for example, units sold) is a complex issue on its own. Forecast values must be integers greater than or equal to zero, and most forecasting models are not built to accommodate this. To address the positivity issue, a log transformation can be used when creating a model to force positive forecast values, but as long as historical volumes are high enough this isn’t an issue in practice.
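As a quick sketch (not used in the models below), fable lets you specify the transformation directly on the response and automatically back-transforms the forecasts; here log(quantity + 1) guards against zeros:

# illustrative only: log transform on the total monthly series
data |>
  group_by(month) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(index = month) |>
  model(ets_log = ETS(log(quantity + 1))) |>
  forecast(h = "1 year")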

With that in mind, we’re going to look at the different ways to aggregate the data, starting with segment as it’s the least familiar to me. The segments aren’t defined and I’m not sure what separates consumer from home office, but I would expect the corporate segment to have a different demand pattern than consumer, and it may have different weights at the category/sub_category level.

The first glance at segment sales over time is a little messy to read. It seems like consumer has a different seasonality, but it’s hard to tell because of the different overall sales volume.

data |>
  group_by(segment, month) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  ggplot(aes(x = month, y = quantity, color = segment)) +
  geom_line() +
  theme_minimal() +
  labs(
    title = "Total Sales by Month",
    x = NULL
  ) +
  scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")
for (target_category in unique(data$category)) {
  print(data |>
    filter(category == target_category) |>
    group_by(segment, month) |>
    summarize(quantity = sum(quantity), .groups = "drop") |>
    ggplot(aes(x = month, y = quantity, color = segment)) +
    geom_line() +
    theme_minimal() +
    labs(
      title = paste(target_category, "Sales by Month"),
      x = NULL
    ) +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y"))
}

The sales by category seem suspiciously similar across the total and all categories, but it’s hard to see for sure at the monthly level. Adding up the total sales history and viewing the proportions by category shows that while total sales differ by segment, they are split proportionally the same by category! Really unusual, and probably a function of the data generation process for this set.

These charts have some interactivity I was experimenting with, so you can hover your mouse and click.

for (target_category in unique(data$category)) {
  p14 <- data |>
    filter(category == target_category) |>
    group_by(segment, month) |>
    summarise(quantity = sum(quantity), .groups = "drop") |>
    ggplot(aes(x = month, y = quantity, color = segment)) +
    geom_line() +
    labs(title = target_category) +
    theme_minimal() +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y") +
    labs(x = NULL)

  print(plotly::ggplotly(p14))
}

p15 <- data |>
  group_by(segment, category) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  ggplot(aes(x = segment, y = quantity, fill = category)) +
  geom_col() +
  theme_minimal() +
  theme(legend.position = "none")


p16 <- data |>
  group_by(segment, category) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  ggplot(aes(x = segment, y = quantity, fill = category)) +
  geom_col(position = "fill") +
  theme_minimal() +
  labs(y = "proportion")

plotly::subplot(plotly::ggplotly(p15), plotly::ggplotly(p16))

Diving deeper into the sub_category level, we see the same issue.

p16 <- data |>
  group_by(segment, sub_category, category) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  ggplot(aes(x = segment, y = quantity, fill = sub_category)) +
  geom_col(position = "fill") +
  theme_minimal() +
  facet_wrap(vars(category)) +
  theme(axis.text.x = element_text(
    vjust = 0.5,
    angle = 45
  )) +
  labs(x = NULL, y = "proportion")

plotly::ggplotly(p16)

Maybe they have different seasonality and trend components?

p8 <- data |>
  group_by(month, segment) |>
  summarize(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = segment, index = month) |>
  model(STL(quantity)) |>
  fabletools::components() |>
  ggplot(aes(x = month, y = season_year, color = segment)) +
  geom_line() +
  theme_minimal() +
  theme(
    axis.title.x = element_blank(),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank()
  ) +
  labs(title = "Trend and seasonality decomposition") +
  theme(legend.position = "none")

p9 <- data |>
  group_by(month, segment) |>
  summarize(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = segment, index = month) |>
  model(STL(quantity)) |>
  fabletools::components() |>
  ggplot(aes(x = month, y = trend, color = segment)) +
  geom_line() +
  theme_minimal() +
  scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y") +
  labs(x = NULL)

print(p8 / p9)

But is this seasonality something we would want to capture through hierarchical forecasting, or is it just noise? It’s hard to tell from the time series chart alone. The decomposition helps us see that the segments do actually behave differently. While the trend is very similar, the seasonality is clearly different: consumer has a stronger holiday season in November and December, with a trough in January and February. I’m curious why corporate and consumer share a spike in September (for consumer I would have guessed back to school sales), but regardless there is a seasonality component here that we should attempt to capture in our forecast.

for (target_category in unique(data$category)) {
  p8 <- data |>
    filter(category == target_category) |>
    group_by(month, segment) |>
    summarize(quantity = sum(quantity), .groups = "drop") |>
    as_tsibble(key = segment, index = month) |>
    fill_gaps(quantity = 0) |>
    model(STL(quantity)) |>
    fabletools::components() |>
    ggplot(aes(x = month, y = season_year, color = segment)) +
    geom_line() +
    labs(title = paste(target_category, "Decomposition")) +
    theme_minimal() +
    theme(
      axis.title.x = element_blank(),
      axis.text.x = element_blank(),
      axis.ticks.x = element_blank()
    ) +
    theme(legend.position = "none")

  p9 <- data |>
    filter(category == target_category) |>
    group_by(month, segment) |>
    summarize(quantity = sum(quantity), .groups = "drop") |>
    as_tsibble(key = segment, index = month) |>
    fill_gaps(quantity = 0) |>
    model(STL(quantity)) |>
    fabletools::components() |>
    ggplot(aes(x = month, y = trend, color = segment)) +
    geom_line() +
    theme_minimal() +
    theme(
      axis.title.x = element_blank(),
      axis.text.x = element_blank(),
      axis.ticks.x = element_blank()
    ) +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")

  p10 <- data |>
    filter(category == target_category) |>
    group_by(month, segment) |>
    summarize(quantity = sum(quantity), .groups = "drop") |>
    as_tsibble(key = segment, index = month) |>
    fill_gaps(quantity = 0) |>
    model(STL(quantity)) |>
    fabletools::components() |>
    ggplot(aes(x = month, y = remainder, color = segment)) +
    geom_line() +
    theme_minimal() +
    tsibble::scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y") +
    labs(x = NULL) +
    theme(legend.position = "none")

  print(p8 / p9 / p10)
}

To check more rigorously, we can compare seasonal strength and trend strength without being weighted by total quantity. There are some differences in trend and seasonality, but whether these differences will give us better forecast accuracy is unclear. (There are some ugly triangles here because I was experimenting with how to show the relationship between category and segment at the same time.)

data |>
  group_by(category, month, segment) |>
  summarize(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = c("segment", "category"), index = month) |>
  fill_gaps(quantity = 0) |>
  features(quantity, feat_stl) |>
  ggplot(aes(x = trend_strength, y = seasonal_strength_year)) +
  geom_point(aes(color = category), size = 2) +
  geom_polygon(aes(fill = segment, color = segment), alpha = .3) +
  geom_text(aes(label = segment), vjust = "inward", hjust = "inward") +
  theme_minimal() +
  coord_fixed() +
  labs(title = "Seasonal strength vs trend strength") +
  xlab(label = "Trend strength") +
  ylab(label = "Seasonal strength") +
  theme(legend.position = "right")

Not quite as impactful when we zoom out, however.

data |>
  group_by(category, month, segment) |>
  summarize(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = c("segment", "category"), index = month) |>
  fill_gaps(quantity = 0) |>
  features(quantity, feat_stl) |>
  ggplot(aes(x = trend_strength, y = seasonal_strength_year, color = segment)) +
  geom_point(size = 2) +
  # geom_polygon(aes(fill = category), alpha = .3) +
  # geom_text(aes(label = segment), vjust = "inward", hjust = "inward") +
  theme_minimal() +
  coord_fixed() +
  labs(title = "Seasonal strength vs trend strength") +
  xlab(label = "Trend strength") +
  ylab(label = "Seasonal strength") +
  theme(legend.position = "right") +
  scale_x_continuous(limits = c(0, 1)) +
  scale_y_continuous(limits = c(0, 1))

3.1.3 Category and Sub Category

Category and sub_category have a parent/child relationship, with products logically nested based on their type. Since we’ve ruled out forecasting at the product_id level, we’ll look into sub_category as our lowest level of aggregation for forecasting, and get our first look at sales over time.

It’s clear that we don’t have the same intermittent demand issue as with product_id. Categories have clear trend and seasonality, and sub_categories have enough volume to generate a forecast from.

for (target_category in unique(data$category)) {
  p99 <- data |>
    filter(category == target_category) |>
    group_by(month) |>
    summarise(quantity = sum(quantity), .groups = "drop") |>
    ggplot(aes(x = month, y = quantity)) +
    geom_line(color = "blue3") +
    labs(
      title = paste(target_category, "Total"),
      x = NULL
    ) +
    theme_minimal() +
    tsibble::scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")


  p14 <- data |>
    filter(category == target_category) |>
    group_by(month, sub_category) |>
    summarise(quantity = sum(quantity), .groups = "drop") |>
    ggplot(aes(x = month, y = quantity)) +
    geom_line(color = "blue3") +
    facet_wrap(vars(sub_category), scales = "free_y") +
    labs(
      title = paste(target_category, "by Sub Category"),
      x = NULL
    ) +
    theme_minimal() +
    tsibble::scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")

  print(p99)
  print(p14)
}

Total unique combinations of category, sub_category, and segment: 51.

data |>
  group_by(category, sub_category, segment) |>
  summarize(.groups = "drop")
# A tibble: 51 × 3
   category  sub_category segment    
   <chr>     <chr>        <chr>      
 1 Furniture Bookcases    Consumer   
 2 Furniture Bookcases    Corporate  
 3 Furniture Bookcases    Home Office
 4 Furniture Chairs       Consumer   
 5 Furniture Chairs       Corporate  
 6 Furniture Chairs       Home Office
 7 Furniture Furnishings  Consumer   
 8 Furniture Furnishings  Corporate  
 9 Furniture Furnishings  Home Office
10 Furniture Tables       Consumer   
# … with 41 more rows

Maybe include a decomposition? Look into making the series stationary for the ARIMA forecast? It might clutter things up too much.

# for (target_category in unique(data$category)) {
#   p8 <- data |>
#     filter(category == target_category) |>
#     group_by(month) |>
#     summarize(quantity = sum(quantity), .groups = "drop") |>
#     as_tsibble(index = month) |>
#     model(STL(quantity)) |>
#     components() |>
#     autoplot() + labs(title = target_category) +
#     theme_minimal()
#
#   print(p8)
# }
#
# data |>
#   group_by(month) |>
#   summarize(quantity = sum(quantity), .groups = "drop") |>
#   as_tsibble(index = month) |>
#   model(STL(quantity)) |>
#   components() |>
#   autoplot() + labs(title = "All categories") +
#   theme_minimal()

The charts below show the distribution of quantities sold per month using category/sub_category and category/sub_category/segment. At the lowest level (category/sub_category/segment) our data starts getting a lot of zeros, and drilling this far down may lead to less accurate forecasts due to intermittency.

data |>
  group_by(category, sub_category, month) |>
  summarize(
    quantity = sum(quantity), .groups = "drop"
  ) |>
  as_tsibble(
    key = c("category", "sub_category"),
    index = month
  ) |>
  fill_gaps(quantity = 0, .full = end()) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, fill = "blue4") +
  theme_minimal() +
  labs(title = "Category/Sub_category")
data |>
  group_by(category, sub_category, segment, month) |>
  summarize(
    quantity = sum(quantity), .groups = "drop"
  ) |>
  as_tsibble(
    key = c("category", "sub_category", "segment"),
    index = month
  ) |>
  fill_gaps(quantity = 0, .full = end()) |>
  as_tibble() |>
  ggplot(aes(x = quantity)) +
  geom_histogram(binwidth = 1, fill = "blue4") +
  theme_minimal() +
  labs(title = "Category/Sub_category/Segment")

4 Develop Models

4.1 Model including segment

Clean the data and fill gaps with zero. The code below sets up our history, including segment, as a time series object.

range(data$month)
<yearmonth[2]>
[1] "2014 Jan" "2017 Dec"
fc_data_wsegment <- data |>
  group_by(category, sub_category, segment, month) |>
  summarize(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = c("category", "sub_category", "segment"), index = month)

nrow(has_gaps(fc_data_wsegment) |> filter(.gaps == 1))
[1] 40
fc_data_wsegment <- fc_data_wsegment |>
  fill_gaps(quantity = 0, .full = end())

nrow(has_gaps(fc_data_wsegment) |> filter(.gaps == 1))
[1] 0

Set up level of aggregation

Category and sub_category have an explicit parent/child relationship. Segment on the other hand will be treated as a group.

fc_data_wsegment <- fc_data_wsegment |> aggregate_key(((category / sub_category) * segment), quantity = sum(quantity))

Subset training data by filtering out 2017

train_data_wsegment <- fc_data_wsegment |> filter(year(month) < 2017)

Fit model to training data

It’s almost unanimous that combining different forecasting models reduces error, so for this approach I use a model combining ETS and ARIMA with equal weights. Rather than fitting each model, inspecting its residuals, and fine tuning, we’ll let the fable package automatically select the model specification at each level of aggregation. The model selection we’ll do ourselves is whether to use the bottom-up or one of the min_trace solutions.

The reconciliation methods selected for this forecast are “mint_shrink” and “ols”; more detail on these at mint.pdf (robjhyndman.com). In my past experience mint_shrink has errored out due to low item weights, so we’ll include both.
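For reference, the min_trace (“MinT”) reconciliation described in that paper maps the vector of base forecasts $\hat{\boldsymbol{y}}_h$ across all aggregation levels to coherent forecasts via

$$\tilde{\boldsymbol{y}}_h = \boldsymbol{S}(\boldsymbol{S}'\boldsymbol{W}_h^{-1}\boldsymbol{S})^{-1}\boldsymbol{S}'\boldsymbol{W}_h^{-1}\hat{\boldsymbol{y}}_h,$$

where $\boldsymbol{S}$ is the summing matrix encoding the aggregation structure and $\boldsymbol{W}_h$ is the covariance matrix of the base forecast errors: “ols” sets $\boldsymbol{W}_h$ proportional to the identity, while “mint_shrink” uses a shrinkage estimate of the sample error covariance.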

Ideally we would manually fit a model to each series and adjust trend and seasonality based on input from different departments, this is where enterprise level business forecasting software comes in handy!

fit_wsegment <- train_data_wsegment |>
  model(
    arima = ARIMA(quantity),
    ets = ETS(quantity)
  ) |>
  mutate(combo = (arima + ets) / 2) |>
  select(-ets, -arima) |>
  reconcile(
    bu = bottom_up(combo),
    shrink = min_trace(combo, method = "mint_shrink"),
    ols = min_trace(combo, method = "ols")
  )

Generate a forecast from our model for the next year (2017), which we can test against our actuals.

fc_wsegment <- fit_wsegment |>
  forecast(h = "1 year")

Here’s a quick pile of forecast visualizations at different levels:

  • “bu” is the bottom-up approach, simply summing up the forecasts of the lower levels to get each level of aggregation

  • “combo” is the direct forecast at a given level of aggregation, using the ARIMA and ETS models weighted evenly

  • “ols” is min_trace with the ols method

  • “shrink” is min_trace with the mint_shrink method

for (target_category in unique(data$category)) {
  pfc1 <- fc_wsegment |>
    filter(category == target_category, is_aggregated(segment), !is_aggregated(sub_category)) |>
    autoplot(fc_data_wsegment, alpha = .8, level = NULL) +
    facet_wrap(vars(sub_category), scales = "free_y") +
    labs(title = target_category) +
    theme_minimal() + theme(axis.text.x = element_text(
      vjust = 0.5,
      angle = 45
    )) +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")

  pfc2 <- fc_wsegment |>
    filter(category == target_category, is_aggregated(segment), is_aggregated(sub_category)) |>
    autoplot(fc_data_wsegment, alpha = .8, level = NULL) +
    labs(title = target_category) +
    theme_minimal() +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")


  print(pfc2)
  print(pfc1)
}
fc_wsegment |>
  filter(is_aggregated(segment), is_aggregated(sub_category), is_aggregated(category)) |>
  autoplot(fc_data_wsegment, alpha = .8, level = NULL) +
  facet_wrap(vars(.model)) +
  theme_minimal() +
  labs(title = "Total") +
  theme(axis.text.x = element_text(
    vjust = 0.5,
    angle = 45
  )) +
  scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")
fc_wsegment |>
  filter(!is_aggregated(segment), is_aggregated(sub_category), is_aggregated(category)) |>
  autoplot(fc_data_wsegment, alpha = .8, level = NULL) +
  facet_wrap(vars(segment)) +
  theme_minimal() +
  labs(title = "Segment") +
  theme(axis.text.x = element_text(
    vjust = 0.5,
    angle = 45
  )) +
  scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")

4.2 Model without segment

And here we do all of the same without including segment, just category/sub_category.

Fill gaps without segment

fc_data_wosegment <- data |>
  group_by(category, sub_category, month) |>
  summarize(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(key = c("category", "sub_category"), index = month)

nrow(has_gaps(fc_data_wosegment) |> filter(.gaps == 1))
[1] 6
fc_data_wosegment <- fc_data_wosegment |>
  fill_gaps(quantity = 0, .full = end())


nrow(has_gaps(fc_data_wosegment) |> filter(.gaps == 1))
[1] 0

Set up aggregation

fc_data_wosegment <- fc_data_wosegment |> aggregate_key(category / sub_category, quantity = sum(quantity))

Subset training data, filtering out 2017

train_data_wosegment <- fc_data_wosegment |> filter(year(month) < 2017)

Fit model to training data

fit_wosegment <- train_data_wosegment |>
  model(
    arima = ARIMA(quantity),
    ets = ETS(quantity)
  ) |>
  mutate(combo = (arima + ets) / 2) |>
  select(-ets, -arima) |>
  reconcile(
    bu = bottom_up(combo),
    shrink = min_trace(combo, method = "mint_shrink"),
    ols = min_trace(combo, method = "ols")
  )

Generate a forecast from our model

fc_wosegment <- fit_wosegment |>
  forecast(h = "1 year")

Forecast visualization

for (target_category in unique(data$category)) {
  pfc1 <- fc_wosegment |>
    filter(category == target_category, !is_aggregated(sub_category)) |>
    autoplot(fc_data_wosegment, alpha = .8, level = NULL) +
    facet_wrap(vars(sub_category), scales = "free_y") +
    labs(title = target_category) +
    theme_minimal() +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")

  pfc2 <- fc_wosegment |>
    filter(category == target_category, is_aggregated(sub_category)) |>
    autoplot(fc_data_wosegment, alpha = .8, level = NULL) +
    labs(title = target_category) +
    theme_minimal() +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")


  print(pfc2)
  print(pfc1)
}
fc_wosegment |>
  filter(is_aggregated(sub_category), is_aggregated(category)) |>
  autoplot(fc_data_wosegment, alpha = .8, level = NULL) +
  facet_wrap(vars(.model)) +
  theme_minimal() +
  scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y")

5 Forecast Accuracy comparison

  • As expected, the bottom-up approach gets less accurate when we add segmentation, because each series carries less information

  • Directly forecasting the total was the most accurate at that level

  • The min_trace solutions created the forecasts with the lowest MASE for category and sub_category, with only minor differences between ols and mint_shrink

  • Segmentation in general had only a minor impact on MASE
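For reference, MASE (as defined in FPP3) scales each forecast error by the in-sample mean absolute error of a seasonal naive forecast (period $m = 12$ for monthly data), so values above 1 mean the forecast did worse than that simple benchmark:

$$\text{MASE} = \text{mean}(|q_j|), \qquad q_j = \frac{e_j}{\dfrac{1}{T-m}\sum_{t=m+1}^{T} \left|y_t - y_{t-m}\right|}$$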

The decisions to make here are which reconciliation method to use, and whether to include segmentation in our model. While segmentation didn’t impact forecast performance much, I’m going to create the final forecast using it: there are some theoretical benefits to the business in having forecasts broken out by segment, and it isn’t making our test forecast worse, so we’ll assume there is a hypothetical business need for it. I chose the min_trace reconciliation method with mint_shrink because it created slightly more accurate forecasts at the lowest level.

fa1 <- fc_wsegment |>
  filter(is_aggregated(category), is_aggregated(sub_category), is_aggregated(segment)) |>
  accuracy(
    data = fc_data_wsegment,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(include_segment = "yes", level = "total")


fa2 <- fc_wosegment |>
  filter(is_aggregated(category), is_aggregated(sub_category)) |>
  accuracy(
    data = fc_data_wosegment,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(include_segment = "no", level = "total")

fa3 <- fc_wsegment |>
  filter(!is_aggregated(category), is_aggregated(sub_category), is_aggregated(segment)) |>
  accuracy(
    data = fc_data_wsegment,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(include_segment = "yes", level = "category")


fa4 <- fc_wosegment |>
  filter(!is_aggregated(category), is_aggregated(sub_category)) |>
  accuracy(
    data = fc_data_wosegment,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(include_segment = "no", level = "category")

fa5 <- fc_wsegment |>
  filter(!is_aggregated(category), !is_aggregated(sub_category), is_aggregated(segment)) |>
  accuracy(
    data = fc_data_wsegment,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(include_segment = "yes", level = "sub_category")


fa6 <- fc_wosegment |>
  filter(!is_aggregated(category), !is_aggregated(sub_category)) |>
  accuracy(
    data = fc_data_wosegment,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(include_segment = "no", level = "sub_category")

accuracytable <- bind_rows(fa1, fa2, fa3, fa4, fa5, fa6)

accuracytable <- accuracytable |> pivot_wider(names_from = .model, values_from = mase)

kable(accuracytable,
  caption = "MASE comparison: bu = bottom-up, combo = base forecast at that level of aggregation, ols/shrink = min_trace reconciliation methods",
  digits = 2
)
MASE comparison: bu = bottom-up, combo = base forecast at that level of aggregation, ols/shrink = min_trace reconciliation methods
include_segment level bu combo ols shrink
yes total 2.08 1.30 1.39 1.40
no total 1.60 1.30 1.33 1.39
yes category 1.96 1.48 1.42 1.41
no category 1.56 1.48 1.41 1.44
yes sub_category 1.30 1.22 1.16 1.14
no sub_category 1.22 1.22 1.18 1.17

5.1 Final Forecast

Usually I try to avoid “final” because it leads to file names like final_rev2 and final_final, but I think it’s okay for this document.

fit_wsegment <- fc_data_wsegment |>
  model(
    arima = ARIMA(quantity),
    ets = ETS(quantity)
  ) |>
  mutate(combo = (arima + ets) / 2) |>
  select(-ets, -arima) |>
  reconcile(
    mint = min_trace(combo, method = "mint_shrink")
  )

Forecast out one year

fc <- fit_wsegment |>
  forecast(h = "1 year")

Final forecasts

fc |>
  filter(is_aggregated(sub_category), is_aggregated(segment), .model == "mint") |>
  autoplot(fc_data_wsegment, level = 90) +
  facet_wrap(vars(category), scales = "free_y") +
  theme_minimal() +
  scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y") +
  labs(title = "Category")
for (target_category in unique(data$category)) {
  p <- fc |>
    filter(
      category == target_category, !is_aggregated(sub_category),
      is_aggregated(segment),
      .model == "mint"
    ) |>
    autoplot(fc_data_wsegment, level = 90) +
    facet_wrap(vars(sub_category), scales = "free_y") +
    labs(title = target_category) +
    theme_minimal() +
    scale_x_yearmonth(date_breaks = "1 year", date_labels = "%Y") + theme(axis.text.x = element_text(
      vjust = 0.5,
      angle = 45
    ))

  print(p)
}

5.2 Clean up the final forecast for export: floor negatives at zero, change to integer values

Below is an example of what the exported data would look like. Note that truncating to integers leads to incoherent forecasts (e.g., two child forecasts of 10.6 each become 10 + 10 = 20, while their parent’s 21.2 becomes 21), and flooring negative values at zero does as well. Be careful when interpreting and publishing the different levels of aggregation.

final_fcst <- fc |>
  filter(.model == "mint") |>
  as_tibble() |>
  select(-.model) |>
  mutate(
    date = make_date(year = year(month), month = month(month), day = 1),
    `80%` = hilo(quantity, 80),
    `90%` = hilo(quantity, 90)
  ) |>
  select(-quantity, -month) |>
  unpack_hilo(c(`80%`, `90%`)) |>
  relocate(date)


final_fcst <- final_fcst |>
  mutate(across(
    .cols = .mean:`90%_upper`,
    .fns = ~ pmax(., 0)
  )) |>
  mutate(across(
    .cols = .mean:`90%_upper`,
    .fns = ~ as.integer(.x)
  ))



write_csv(final_fcst, file = "final_fcst.csv")

kable(head(final_fcst, n = 10),
  caption = "Example of data export"
)
Example of data export
date category segment sub_category .mean 80%_lower 80%_upper 90%_lower 90%_upper
2018-01-01 Furniture Consumer Bookcases 10 1 19 0 22
2018-02-01 Furniture Consumer Bookcases 11 2 20 0 22
2018-03-01 Furniture Consumer Bookcases 15 6 24 3 27
2018-04-01 Furniture Consumer Bookcases 17 7 26 5 29
2018-05-01 Furniture Consumer Bookcases 17 7 26 4 29
2018-06-01 Furniture Consumer Bookcases 19 9 28 6 31
2018-07-01 Furniture Consumer Bookcases 15 6 25 3 28
2018-08-01 Furniture Consumer Bookcases 14 4 24 2 26
2018-09-01 Furniture Consumer Bookcases 22 12 32 9 35
2018-10-01 Furniture Consumer Bookcases 18 9 28 6 31

6 Coming back to this project to forecast at product level

I was planning on leaving this analysis at the sub_category level, but I ended up coming back later wanting to create product level forecasts. Turns out I shouldn’t have named that table “final_fcst” after all!

As much as I would love to only forecast high volume series, business forecasting doesn’t usually have that luxury. Creating category and sub_category level forecasts can be extremely helpful to many parts of a business, from trend analysis to allocating sales resources to factory and staff planning, but forecasts are often used for supply chain purchasing purposes, which require forecasts at the lowest level of aggregation.

In this section I’m going to do the following, addressing the issues caused by each:

  • Analyze the item level at daily, weekly, monthly, quarterly, and annual aggregations to see if we can find trends at any of those levels

    • We built a histogram of historical sales quantities at these levels, but we’ll dive deeper in this section
  • Apply the Croston model, which is built for intermittent demand

  • Disaggregate the category/sub_category forecast, weighted by the Croston forecast (come back to this if we have poor performance with the reconciliation method). This is a less fancy solution but would work

  • Apply temporal hierarchies combined with Croston

    • We’ll quickly run into issues with the Croston model that conflict with a “one number” plan/forecast. We’ll see if we can take the best of both worlds and create a model that produces sub_category, category, and total forecasts that make sense, while also creating product level forecasts
  • Convert forecast values to integers (see the sketch after this list)

    • We can’t just round like before, because many if not most forecast values will be less than one at the item level
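As a sketch of one option for that last point (not necessarily the approach I’ll land on), stochastic rounding preserves the expected total when point forecasts are fractional:

# a forecast of 0.3 becomes 1 with probability 0.3 and 0 otherwise,
# so totals are preserved in expectation
stochastic_round <- function(x) {
  floor(x) + rbinom(length(x), size = 1, prob = x - floor(x))
}

set.seed(42)
stochastic_round(c(0.3, 1.7, 0.05))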

6.1 Reload the data and clean

We overwrote the “data” variable with aggregated data in some transformations earlier, so we can’t reuse it. Let’s reload it as data2 and clean it up with the same code as before.

data2 <- readxl::read_xls("C:/Users/andre/Documents/r4ds/SampleSuperStore/Sample - Superstore.xls")

# clean up column names, using snake_case as default
data2 <- data2 |> janitor::clean_names()

data2 <- data2 |>
  mutate(order_date = as.Date(order_date), ship_date = as.Date(ship_date), .keep = "all") |>
  select(-row_id)

data2 <- data2 |>
  group_by(
    order_date,
    product_id,
    category,
    sub_category
  ) |>
  summarize(quantity = sum(quantity), .groups = "drop")

Set up a list of 20 random products to take a look at.

set.seed(99)
product_list <- unique(data2$product_id)
product_sample <- sample(product_list, 20)

product_sample
 [1] "FUR-BO-10000780" "FUR-FU-10001037" "OFF-PA-10003205" "OFF-PA-10000726"
 [5] "FUR-CH-10002965" "TEC-AC-10004568" "OFF-AR-10002956" "FUR-CH-10003774"
 [9] "FUR-FU-10003347" "TEC-CO-10001766" "TEC-PH-10001459" "OFF-AR-10004511"
[13] "OFF-BI-10000050" "OFF-LA-10004272" "FUR-FU-10002253" "TEC-AC-10003023"
[17] "OFF-ST-10001505" "OFF-AR-10001955" "OFF-LA-10002034" "OFF-BI-10001098"

Set up our time series table with category, sub_category, and product_id. I chose to drop segment in this case; we already know that our product_id level data is sparse, so there’s no reason to divide it further into three groups by segment. Here we take our data, summarize on order_date, category, sub_category, and product_id, coerce it to a tsibble, and fill in the gaps for the full range, all in one step.

data2_ts <- data2 |>
  group_by(order_date, category, sub_category, product_id) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(
    key = c("product_id", "category", "sub_category"),
    index = order_date
  ) |>
  fill_gaps(quantity = 0, .full = TRUE)

Quick look at the daily data, and yep, it’s a mess.

data2_ts |>
  filter(product_id %in% product_sample) |>
  ggplot(aes(x = order_date, y = quantity)) +
  geom_line(aes(color = product_id)) +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(
    title = "Random 20 product_id",
    x = "day"
  )

6.2 Check trend/seasonality at different temporal aggregations

Set up weekly, monthly, quarterly, and annual time series; maybe we can get some value from temporal aggregation (aggregating time to these different levels).

data2_ts_weekly <- data2_ts |>
  as_tibble() |>
  group_by(
    week = yearweek(order_date),
    product_id,
    category,
    sub_category
  ) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(
    key = c("product_id", "category", "sub_category"),
    index = week
  )

data2_ts_monthly <- data2_ts |>
  as_tibble() |>
  group_by(
    month = yearmonth(order_date),
    product_id,
    category,
    sub_category
  ) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(
    key = c("product_id", "category", "sub_category"),
    index = month
  )

data2_ts_quarterly <- data2_ts |>
  as_tibble() |>
  group_by(
    quarter = yearquarter(order_date),
    product_id,
    category,
    sub_category
  ) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(
    key = c("product_id", "category", "sub_category"),
    index = quarter
  )

data2_ts_annual <- data2_ts |>
  as_tibble() |>
  group_by(
    year = year(order_date),
    product_id,
    category,
    sub_category
  ) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(
    key = c("product_id", "category", "sub_category"),
    index = year
  )

Plot that same random sample… and we see that it’s still essentially random noise; we’re not getting any trend or seasonality.

p1 <- data2_ts_weekly |>
  filter(product_id %in% product_sample) |>
  ggplot(aes(x = week, y = quantity, color = product_id, group = product_id)) +
  geom_line() +
  theme_minimal() +
  theme(legend.position = "none")

p2 <- data2_ts_monthly |>
  filter(product_id %in% product_sample) |>
  ggplot(aes(x = month, y = quantity, color = product_id)) +
  geom_line() +
  theme_minimal() +
  theme(legend.position = "none")

p3 <- data2_ts_quarterly |>
  filter(product_id %in% product_sample) |>
  select(product_id, quarter, quantity) |>
  ggplot(aes(x = quarter, y = quantity, color = product_id, group = product_id)) +
  geom_line() +
  theme_minimal() +
  theme(legend.position = "none")

p4 <- data2_ts_annual |>
  filter(product_id %in% product_sample) |>
  select(product_id, year, quantity) |>
  ggplot(aes(x = year, y = quantity, color = product_id)) +
  geom_line() +
  theme_minimal() +
  theme(legend.position = "none")

((p1 | p2) / (p3 | p4)) + plot_annotation(title = "random 20 product_id")

How about the top 10 selling products? Still not great. We would expect very clear trend and seasonality from the highest volume products, but we’re not getting it. At this point we can assume all product data is intermittent, use the Croston method to forecast it, and forget the idea of using temporal aggregation.

top10 <- data2_ts |>
  as_tibble() |>
  group_by(product_id) |>
  summarise(quantity = sum(quantity)) |>
  arrange(desc(quantity)) |>
  top_n(10, quantity) |>
  select(product_id)

p1 <- data2_ts_weekly |>
  filter(product_id %in% top10$product_id) |>
  ggplot(aes(x = week, y = quantity, color = product_id)) +
  geom_line() +
  theme_minimal() +
  labs(title = "weekly") +
  theme(legend.position = "none")

p2 <- data2_ts_monthly |>
  filter(product_id %in% top10$product_id) |>
  ggplot(aes(x = month, y = quantity, color = product_id)) +
  geom_line() +
  theme_minimal() +
  labs(title = "monthly") +
  theme(legend.position = "none")

p3 <- data2_ts_quarterly |>
  filter(product_id %in% top10$product_id) |>
  ggplot(aes(x = quarter, y = quantity, color = product_id)) +
  geom_line() +
  theme_minimal() +
  labs(title = "quarterly") +
  theme(legend.position = "none")

p4 <- data2_ts_annual |>
  filter(product_id %in% top10$product_id) |>
  ggplot(aes(x = year, y = quantity, color = product_id)) +
  geom_line() +
  theme_minimal() +
  labs(title = "annual") +
  theme(legend.position = "none")

(p1 | p2) / (p3 | p4) + plot_annotation(title = "Top 10 product_id")

6.3 Forecast with Croston for intermittent demand

I defaulted here to monthly data, which from my experience is the typical period for business forecasting. I then fit a model using the Croston method and forecast 12 months, using 2014–2016 to fit the model and 2017 to get forecast accuracy metrics. The test model generates 297 null models; about 16% of our products are erroring out! This isn’t really a problem now, since the null models will just generate NA forecasts, but it will be a problem if we choose to reconcile using the reconcile function in the fable package. The null models are likely because those products have too little history: as the warning says, Croston’s method needs at least two non-zero values.

monthly_cronston_fit_test <- data2_ts_monthly |> 
  filter(year(month) < 2017) |> 
  model(.model = CROSTON(quantity))
Warning: 297 errors (1 unique) encountered for .model
[297] At least two non-zero values are required to use Croston's method.
monthly_cronston_fc_test <- monthly_cronston_fit_test |>
  forecast(h = 12)

We can test this assumption by filtering to the null models from our model table, then filtering our sales data to that product list where sales > 0. That table has 297 rows, meaning this is indeed the cause of our errors. Null models will create NA forecast values, which isn’t a problem for now; we can just replace those with zeros, and in a business process they would be marked for review.

We’ll create the list of products as an object we can filter on later.

monthly_cronston_nulls_test <- monthly_cronston_fit_test |> 
  mutate(null = is_null_model(.model)) |> 
  filter(null == 1)


length(unique(monthly_cronston_nulls_test$product_id))
[1] 297

Let’s take a look at total quantities over time for these products, and yikes! They have a lot more sales in 2017. This will make things goofy when comparing accuracy to our previous method (using only category and sub_category). Our category/sub_category forecast was under-forecasting, and this is good reason to think these products may be why we saw so much growth in 2017.

data2_ts_monthly |> filter(product_id %in% monthly_cronston_nulls_test$product_id) |> 
  index_by(month) |> 
  summarise(quantity = sum(quantity)) |> 
  ggplot(aes(x = month, y = quantity)) +
  geom_line() +
  theme_minimal()

When we add up the Croston forecasts, we quickly see a problem: they’re flat! The Croston method produces forecasts with no trend, no seasonality, and no confidence range. Not very helpful if using a bottom-up approach!
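For intuition on why it’s flat: Croston’s method (as described in FPP3) smooths the non-zero demand sizes and the intervals between demands separately, and the point forecast is just their ratio,

$$\hat{y}_{T+h|T} = \hat{q}_{T}/\hat{a}_{T},$$

where $\hat{q}_T$ is the smoothed demand size and $\hat{a}_T$ is the smoothed inter-demand interval. Neither depends on the horizon $h$, so every future period gets the same value.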

monthly_cronston_fc_test |> 
  filter(!(product_id %in% monthly_cronston_nulls_test$product_id)) |> 
  as_tibble() |> 
  group_by(category, month) |> 
  summarize(fcst_qty = sum(.mean), .groups = "drop") |> 
  ggplot(aes(month, fcst_qty, col = category)) +
  geom_line() +
  theme_minimal() +
  labs(title = "Cronston produces flat forecast, and category forecasts makes no sense")

6.3.1 Croston and proportional split

https://robjhyndman.com/papers/foresight.pdf

Let's generate MASE metrics for our 2017 forecast, taking the mean by category.

As Marcel the Shell famously said, "compared to what??"; the metrics aren't very meaningful when there's nothing to compare against yet. But we can tell that accuracy is not great: the forecast values are worse than the one-step-ahead naive forecast. That makes sense, though, considering the most accurate forecast would probably be zeros across the board.
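For context, MASE scales each forecast error by the in-sample mean absolute error of a naive forecast (with seasonal period m; m = 1 gives the plain one-step naive benchmark), so values above 1 mean we did worse than naive:

$$\text{MASE} = \operatorname{mean}\left(|q_t|\right), \qquad q_t = \frac{e_t}{\dfrac{1}{T-m}\sum_{t=m+1}^{T}\left|y_t - y_{t-m}\right|}$$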

# monthly_cronston_fc_accuracy_test <-
#   monthly_cronston_fc_test |>
#   filter(!(product_id %in% monthly_cronston_nulls_test$product_id)) |>
#   accuracy(
#     data = data2_ts_monthly |>
#       filter(!(product_id %in% monthly_cronston_nulls_test$product_id)
#     ),
#     measures = list(mase = MASE)
#   )
# 
# kable(monthly_cronston_fc_accuracy_test |>
#   group_by(category) |>
#   summarise(MASE = mean(mase)), caption = "cronston test accuracy")

Also, compare the 2017 forecasts (not our test) to our previous test forecast by category/sub_category. The totals by category actually roll up fairly close, but we quickly see the problem with Croston's method when we chart by month: the forecast is flat when we roll it up! This is always the case with this model; it's effective for intermittent demand, but creates a flat forecast (no trend or seasonality). The item level forecast might be helpful for supply planning on intermittent products, but if we roll it up it doesn't make much sense. ***** come back to: need to discuss the models that dropped out and make comparison difficult, maybe just view vs actuals *****

# monthly_cronston_fc_test |>
#   as_tibble() |>
#   group_by(category) |>
#   summarise(quantity = sum(.mean, na.rm = T))
# 
# fc_wosegment |> 
#   as_tibble() |> 
#   filter(!is_aggregated(category),
#          !(as.character(product_id) %in% monthly_cronston_nulls_test$product_id),
#          .model == "ols"
#   ) |> 
#   group_by(category) |>
#   summarise(quantity = sum(.mean, na.rm = T))
# 
# fc_method_compare <- maybe_works_fit2_rec_fc_test |>
#   filter(
#     !is_aggregated(product_id),
#     !(as.character(product_id) %in% monthly_cronston_nulls_test$product_id),
#     .model == "mint"
#   ) |> 
#   as_tibble() |> 
#   group_by(month, category) |>
#   summarise(aggregated_fc = sum(.mean)) |>
#   left_join(monthly_cronston_fc_test |> as_tibble() |> group_by(month, category) |> summarise(cronston_fc = sum(.mean, na.rm = T)), join_by("month" == "month", "category" == "category")) |> 
#   left_join(
#     data2_ts_monthly |> filter(
#       !(product_id %in% monthly_cronston_nulls_test$product_id),
#       year(month) == 2017) |> as_tibble() |> 
#       group_by(month, category) |> 
#       summarise(actuals = sum(quantity)),
#     join_by("month" == "month", "category" == "category")
#   )
# 
# fc_method_compare |>
#   pivot_longer(cols = c(aggregated_fc, cronston_fc, actuals), names_to = "method", values_to = "quantity") |>
#   mutate(category = as.character(category)) |>
#   ggplot(aes(x = month, y = quantity)) +
#   geom_line(aes(color = method)) +
#   facet_wrap(vars(category), scales = "free", ncol = 1) +
#   theme_minimal() +
#   labs(title = "2017 sales")

One way to fix this is to take the proportions from our product_id level forecast and apply them within each sub_category, splitting our previous reconciled forecast at that level ****come back to this too****.

# cronstonfc_prop_test <- monthly_cronston_fc_test |>
#   as_tibble() |>
#   filter(!(product_id %in% monthly_cronston_nulls_test$product_id)) |>
#   select(product_id, category, sub_category, .mean) |>
#   mutate(.mean = replace_na(.mean, 0)) |>
#   group_by(product_id, category, sub_category) |>
#   summarise(qty = sum(.mean), .groups = "drop") |>
#   group_by(sub_category) |>
#   mutate(prop = qty / sum(qty))


# data2_ts |> as_tibble() |> group_by(category, sub_category, product_id) |>
# summarise(quantity = sum(quantity), .groups = "drop") |>
# group_by(category) |> mutate(cat_prop = quantity/sum(quantity)) |> ungroup() |>
# group_by(sub_category) |> mutate(sub_cat_prop = quantity/sum(quantity)) |> ungroup() |>
# mutate(total_prop = quantity/sum(quantity)) |> select(-quantity)
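A standalone sketch of that split, with hypothetical object names (prod_fc for a product level forecast and sub_cat_fc for the reconciled sub_category forecast; the column names are assumed):

# Hypothetical sketch of a proportional split
# prod_fc:    product level forecast (sub_category, product_id, .mean)
# sub_cat_fc: reconciled sub_category forecast (sub_category, month, .mean)
props <- prod_fc |>
  group_by(sub_category, product_id) |>
  summarise(qty = sum(.mean, na.rm = TRUE), .groups = "drop_last") |>
  mutate(prop = qty / sum(qty)) |> # each product's share of its sub_category
  ungroup()

item_fc <- sub_cat_fc |>
  left_join(props, by = "sub_category") |> # fan each sub_category total out to its products
  mutate(item_fc = .mean * prop)           # split the total by share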

Check to make sure totals are rolling up correctly when using this method, compared to our prior forecast ****come back to this too****.

# sum up final forecast and drop segment for comparison; drop the range because it won't be relevant at the product level
# final_fc_for_props <- final_fcst |>
#   filter(is_aggregated(segment)) |>
#   select(-segment) |>
#   select(date:.mean) |>
#   mutate(category = as.character(category), sub_category = as.character(sub_category))
# 
# asdf <- final_fc_for_props |>
#   right_join(cronstonfc_prop, join_by("category" == "category", "sub_category" == "sub_category")) |>
#   mutate(item_fc = (.mean * prop))
# 
# asdf |>
#   group_by(category) |>
#   summarise(qty = sum(item_fc))
# final_fcst |>
#   filter(!is_aggregated(segment), !is_aggregated(sub_category)) |>
#   group_by(category) |>
#   summarise(qty = sum(.mean))

6.4 Can we use the hierarchical forecast approach?

Let's set up our data at the monthly level with the item hierarchy (category/sub_category/product_id) and go through this approach, dropping the 297 products that only have one observed sales period.

data2_ts_monthly_ag_test <- data2_ts_monthly |> 
  filter(year(month) < 2017,
         !(product_id %in% monthly_cronston_nulls_test$product_id)) |> 
  aggregate_key(category / sub_category / product_id, quantity = sum(quantity))

Unfortunately the fable package doesn't have a case_when-style option for assigning models conditionally, so while it's a little time consuming, we'll fit ETS, ARIMA, and Croston models across all products and levels of aggregation. Remember those 297 products that errored out before? They would become an issue here, because the reconciliation approach cannot accept null models, which is why we dropped them out above before fitting.

ttl_ag_fit_test <-
  data2_ts_monthly_ag_test |> model(
    arima = ARIMA(quantity),
    ets = ETS(quantity),
    cronston = CROSTON(quantity)
  )

ttl_ag_fit_test <-  ttl_ag_fit_test |> 
  mutate(combo = (arima + ets) / 2) |>
  select(-ets, -arima)

6.4.1 Set up model

The fable package doesn't currently have an option to choose a model based on criteria, but we can generate all the models and then bind the model tables together based on criteria. It's much more CPU intensive, but fortunately modeling on criteria is on the list of enhancements for the package!

ags <- ttl_ag_fit_test |>
  filter(is_aggregated(product_id)) |>
  select(-cronston) |>
  rename(.model = combo)
prods <- ttl_ag_fit_test |>
  filter(!is_aggregated(product_id)) |>
  select(-combo) |>
  rename(.model = cronston)

ags
# A mable: 21 x 4
# Key:     category, sub_category, product_id [21]
   category        sub_category product_id          .model
   <chr*>          <chr*>       <chr*>             <model>
 1 Furniture       Bookcases    <aggregated> <COMBINATION>
 2 Furniture       Chairs       <aggregated> <COMBINATION>
 3 Furniture       Furnishings  <aggregated> <COMBINATION>
 4 Furniture       Tables       <aggregated> <COMBINATION>
 5 Furniture       <aggregated> <aggregated> <COMBINATION>
 6 Office Supplies Appliances   <aggregated> <COMBINATION>
 7 Office Supplies Art          <aggregated> <COMBINATION>
 8 Office Supplies Binders      <aggregated> <COMBINATION>
 9 Office Supplies Envelopes    <aggregated> <COMBINATION>
10 Office Supplies Fasteners    <aggregated> <COMBINATION>
# … with 11 more rows
prods
# A mable: 1,565 x 4
# Key:     category, sub_category, product_id [1,565]
   category  sub_category product_id         .model
   <chr*>    <chr*>       <chr*>            <model>
 1 Furniture Bookcases    FUR-BO-10000330 <croston>
 2 Furniture Bookcases    FUR-BO-10000362 <croston>
 3 Furniture Bookcases    FUR-BO-10000468 <croston>
 4 Furniture Bookcases    FUR-BO-10000711 <croston>
 5 Furniture Bookcases    FUR-BO-10000780 <croston>
 6 Furniture Bookcases    FUR-BO-10001337 <croston>
 7 Furniture Bookcases    FUR-BO-10001519 <croston>
 8 Furniture Bookcases    FUR-BO-10001608 <croston>
 9 Furniture Bookcases    FUR-BO-10001798 <croston>
10 Furniture Bookcases    FUR-BO-10001811 <croston>
# … with 1,555 more rows
ttl_ag_fit_test_bind <- bind_rows(ags,prods)
ttl_ag_fit_test_bind_rec <- ttl_ag_fit_test_bind |> 
 reconcile(ols = min_trace(.model, method = "ols"))
ttl_ag_fit_test_fc <- ttl_ag_fit_test_bind_rec |> 
  forecast(h = 12)

Let's visualize how this forecast compares to actuals, and it looks pretty reasonable!

data2_ts_monthly_ag_test_full <- data2_ts_monthly |> 
  filter(!(product_id %in% monthly_cronston_nulls_test$product_id)) |> 
  aggregate_key(category / sub_category / product_id, quantity = sum(quantity))

ttl_ag_fit_test_fc |> filter(is_aggregated(category)) |> 
  autoplot(data2_ts_monthly_ag_test_full, level = NULL) +
  theme_minimal()

ttl_ag_fit_test_fc |> filter(is_aggregated(sub_category), !is_aggregated(category)) |> 
  autoplot(data2_ts_monthly_ag_test_full, level = NULL) +
  theme_minimal() +
  facet_wrap(vars(category), ncol = 3, scales = "free")

ttl_ag_fit_test_fc |> filter(category == "Furniture", !is_aggregated(sub_category), is_aggregated(product_id)) |> 
  autoplot(data2_ts_monthly_ag_test_full, level = NULL) +
  theme_minimal() +
  facet_wrap(vars(sub_category), scales = "free")

6.4.2 Compare to 2017 actuals

Let's check the MASE for this new forecast method… and we see that it is actually hurting our accuracy a little compared to the one-step-ahead naive forecast. But the benefit is that this product level forecast rolls up to a total forecast that actually makes sense and fits history! Depending on the business needs this is a worthwhile trade-off, because the accuracy losses are very minimal.

We could of course have a process to review products with the highest MASE score, but that’s outside my scope for this document.

Another thing this tells us is how impactful those 297 dropped products were to forecast accuracy. The gains in MASE aren't from the detailed product_id level forecasts, but because products with their first or second sale in 2017 accounted for a significant portion of the growth.

We could rerun our category/sub_category forecast dropping those 297 products to get a fair comparison between the methods, but this report already takes 45 minutes to render with all the modeling, so I'm not going to do it here.

fa1 <- ttl_ag_fit_test_fc |>
  filter(is_aggregated(category), is_aggregated(sub_category), is_aggregated(product_id)) |>
  accuracy(
    data = data2_ts_monthly_ag_test_full,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(level = "total")


fa2 <- ttl_ag_fit_test_fc |>
  filter(!is_aggregated(category), is_aggregated(sub_category), is_aggregated(product_id)) |>
  accuracy(
    data = data2_ts_monthly_ag_test_full,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(level = "category")


fa3 <- ttl_ag_fit_test_fc |>
  filter(!is_aggregated(category), !is_aggregated(sub_category), is_aggregated(product_id)) |>
  accuracy(
    data = data2_ts_monthly_ag_test_full,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(level = "sub_category")


fa4 <-  ttl_ag_fit_test_fc |>
  filter(!is_aggregated(category), !is_aggregated(sub_category), !is_aggregated(product_id)) |>
  accuracy(
    data = data2_ts_monthly_ag_test_full,
    measures = list(mase = MASE)
  ) |>
  group_by(.model) |>
  summarize(mase = mean(mase)) |>
  mutate(level = "product_id")

accuracytable2 <- bind_rows(fa1, fa2, fa3, fa4)

accuracytable2 <- accuracytable2 |> pivot_wider(names_from = .model, values_from = mase)

kable(accuracytable2,
  digits = 2)
level         .model   ols
total           0.90  0.91
category        0.98  1.01
sub_category    1.01  1.01
product_id      1.56  1.63

6.4.3 Forecast out the future

Let's save some computing time and check which products only have one sales record. This generates a list of 97 products; we'll drop these out at the beginning because they would create null models.

data2_ts_monthly |> as_tibble() |> mutate(helper = ifelse(quantity > 0, 1, 0)) |> group_by(product_id) |> 
  summarize(helper = sum(helper)) |> filter(helper < 2)
# A tibble: 97 × 2
   product_id      helper
   <chr>            <dbl>
 1 FUR-BO-10000112      1
 2 FUR-BO-10001567      1
 3 FUR-BO-10002206      1
 4 FUR-CH-10002317      1
 5 FUR-FU-10002874      1
 6 FUR-FU-10003095      1
 7 FUR-TA-10001691      1
 8 OFF-AP-10000326      1
 9 OFF-AP-10001124      1
10 OFF-AP-10002203      1
# … with 87 more rows
drops <- data2_ts_monthly |> as_tibble() |> mutate(helper = ifelse(quantity > 0, 1, 0)) |> group_by(product_id) |> 
  summarize(helper = sum(helper)) |> filter(helper < 2) |> select(product_id)

Create a table dropping out those products

data2_ts_monthly_wdrops <- data2_ts_monthly |> filter(!product_id %in% drops$product_id)

Create the time series object without those products at the monthly level

data2_ts_monthly_wdrops <- data2_ts |>
  as_tibble() |>
  filter(!(product_id %in% drops$product_id)) |>
  group_by(
    month = yearmonth(order_date),
    product_id,
    category,
    sub_category
  ) |>
  summarise(quantity = sum(quantity), .groups = "drop") |>
  as_tsibble(
    key = c("product_id", "category", "sub_category"),
    index = month
  )

Create the aggregation

data2_ts_monthly_wdrops_ag <- data2_ts_monthly_wdrops |>
  aggregate_key(category / sub_category / product_id, quantity = sum(quantity))

Fit the ETS/ARIMA combo and a Croston model. The combo will be used from category through sub_category, Croston at the product level.

ttl_ag_wdrops_fit <-
  data2_ts_monthly_wdrops_ag |> model(
    ets = ETS(quantity),
    arima = ARIMA(quantity),
    cronston = CROSTON(quantity)
  )
Warning in sqrt(diag(best$var.coef)): NaNs produced
ttl_ag_wdrops_fit <- ttl_ag_wdrops_fit |> 
  mutate(combo = (arima + ets) / 2) |> 
  select(-ets, -arima)

Splice our mable (model table) by product hierarchy, select the appropriate model for each level, and rebind it together. dplyr makes it easy to bind the mables back together using bind_rows. In the code below, ags holds the category through sub_category models, prods the product_id level models.

ags <- ttl_ag_wdrops_fit |>
  filter(is_aggregated(product_id)) |>
  select(-cronston) |> 
  rename(.model = combo)
prods <- ttl_ag_wdrops_fit |>
  filter(!is_aggregated(product_id)) |>
  select(-combo) |>
  rename(.model = cronston)

ttl_ag_wdrops_fit <- bind_rows(ags, prods)

Take our new mable and add the min_trace reconciliation method

ttl_ag_wdrops_fit_rec <-
  ttl_ag_wdrops_fit |>
  reconcile(ols = min_trace(.model, method = "ols"))
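For reference, min_trace computes the reconciled forecasts as a trace-minimizing linear combination of the base forecasts (FPP3, chapter 11):

$$\tilde{\mathbf{y}}_h = \mathbf{S}\left(\mathbf{S}'\mathbf{W}_h^{-1}\mathbf{S}\right)^{-1}\mathbf{S}'\mathbf{W}_h^{-1}\hat{\mathbf{y}}_h$$

Here $\mathbf{S}$ is the summing matrix that maps the product_id series up through sub_category, category, and the total, $\hat{\mathbf{y}}_h$ are the base forecasts from our mable, and with method = "ols" the weight matrix $\mathbf{W}_h$ is simply the identity.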

Forecast out 12 months

ttl_ag_wdrops_fit_rec_fc <- ttl_ag_wdrops_fit_rec |>
  forecast(h = 12)
for (target_category in unique(data2$category)) {

p1 <- ttl_ag_wdrops_fit_rec_fc |> 
  filter(category == target_category,
    is_aggregated(sub_category), 
    is_aggregated(product_id),
    .model == "ols") |> 
  autoplot(data2_ts_monthly_wdrops_ag, level = NULL) +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(title = paste0(target_category," ", "Category/Sub_category/Product_id", " ", "Forecast"))
 
p2 <- fc |>
  filter(category == target_category,
    is_aggregated(sub_category), is_aggregated(segment), .model == "mint") |>
  autoplot(fc_data_wsegment, level = NULL) +
  theme_minimal() +
    labs(title = paste0(target_category," ", "Category/Sub_category", " ", "Forecast"))

print(p1/p2)
}

catsubcat_tibble <- as_tibble(fc) |> filter(!is_aggregated(category), !is_aggregated(segment), !is_aggregated(sub_category),
                                     .model == "mint") |> 
  mutate(category = as.character(category),
         sub_category = as.character(sub_category),
         .model = "cat/sub_catForecast",
         month = yearmonth(month),
         quantity = .mean) |> 
  select(
          category, sub_category, .model, month, quantity) |> 
  group_by(category, sub_category, month, .model) |> 
  summarise(quantity = sum(quantity), .groups = "drop")

catsubcatprod_tibble <- as_tibble(ttl_ag_wdrops_fit_rec_fc) |> 
  filter(!is_aggregated(category), !is_aggregated(sub_category), !is_aggregated(product_id),
         .model == "ols") |> 
  mutate(category = as.character(category),
         sub_category = as.character(sub_category),
         .model = "cat/sub_cat/product_idForecast",
         month = yearmonth(month),
         quantity = .mean) |> 
  select(
        category, sub_category, .model, month, quantity) |> 
  group_by(category, sub_category, month, .model) |> 
  summarise(quantity = sum(quantity), .groups = "drop")

history <- data2_ts_monthly |> 
  as_tibble() |>  
  group_by(month, category, sub_category) |> 
  summarise(quantity = sum(quantity), .groups = "drop") |> 
  mutate(.model = "history")


catsubcat_tibble <- catsubcat_tibble |> select(month, category, sub_category, quantity, .model)
catsubcatprod_tibble <- catsubcatprod_tibble |> select(month, category, sub_category, quantity, .model)

asdf <- rbind(catsubcat_tibble, catsubcatprod_tibble, history)

#totalforecast

asdf |> group_by(month, .model) |> summarise(quantity = sum(quantity), .groups = "drop") |> 
  ggplot(aes(x = month, y = quantity, color = .model)) +
  geom_line() +
  theme_minimal() +
  labs(title = "Total")
asdf |> 
    group_by(month, .model, category) |> 
    summarise(quantity = sum(quantity), .groups = "drop") |> 
    ggplot(aes(x = month, y = quantity, color = .model)) +
    geom_line() +
    theme_minimal() +
    labs(title = "Category") +
  facet_wrap(vars(category), scales = "free", ncol = 1)
for (target_category in unique(data2$category)) {
  p1 <- asdf |> filter(category == target_category) |> 
    group_by(month, .model, sub_category) |> 
    summarise(quantity = sum(quantity), .groups = "drop") |> 
     ggplot(aes(x = month, y = quantity, color = .model)) +
    geom_line() +
    theme_minimal() +
    labs(title = target_category) +
  facet_wrap(vars(sub_category), scales = "free")
  
  print(p1)
}

6.5 Forecasting as integer values

When forecasting at the category level, we were okay rounding the forecast values to whole numbers and dropping the few negative values to zero, because we had high volume. The median forecast value for the new product_id level forecast is only 0.6; if we were to just round, we would wipe out a lot of demand.

Depending on the end use for the forecast, we might need to find a way to get these to whole numbers. I haven't seen a case where MRP can accept anything else, so we'll work on rolling these up to integers.
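A tiny illustration of the problem, with made-up values near that median:

fc <- rep(0.4, 12)  # a low-volume product: one year of monthly forecasts
sum(fc)             # 4.8 units of annual demand
sum(round(fc))      # 0: naive rounding erases the product entirely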

item_fcst <- ttl_ag_wdrops_fit_rec_fc |> 
  as_tibble() |> 
  filter(!is_aggregated(product_id), .model == "ols") |> 
  mutate(category = as.character(category), sub_category = as.character(sub_category), product_id = as.character(product_id)) |> 
  select(month, category, sub_category, product_id, .mean)

x <- round(median(item_fcst$.mean), 3)

item_fcst |> ggplot(aes(x = .mean)) +
  geom_histogram(binwidth = 0.1, col = "white", fill = "grey4") +
  geom_vline(xintercept = x, col = "red", linewidth = 1) +
  theme_minimal() +
  labs(
    title = paste("Median fcst value is ", x, " per month"),
    x = "forecast"
  ) +
  scale_x_continuous(breaks = seq(0,100,by = 1))

One method is to take the running sum of the sequence of forecasts, and record a whole unit each time the cumulative sum crosses a new integer threshold. For example, a forecast of 0.25, 0.25, 0.25, 0.25, 0.25 creates a cumulative sum of 0.25, 0.5, 0.75, 1.0, 1.25, and the integer forecast values would be 0, 0, 0, 1, 0.
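A quick sanity check of that toy example, using the same cumulative-sum and lagged-difference logic as the code below:

fc <- rep(0.25, 5)
csum <- cumsum(fc)                   # 0.25 0.50 0.75 1.00 1.25
rolling_int <- floor(csum)           # 0 0 0 1 1
int_fcst <- c(0, diff(rolling_int))  # 0 0 0 1 0
int_fcst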

The code below accomplishes this by creating a cumulative sum for each product's forecast, then a rolling floor of that sum (e.g. the floor of 1.9999 is 1L, and of 2.000001 is 2L), and finally a third column that takes the difference of the floor at lag 1, recording a unit whenever the sum reaches a new integer threshold.

Wow! To my surprise, the rolling integer method actually worked pretty well, aligning with our higher level forecasts. The only problem is that we of course have a ZERO forecast for the first month, because there is no prior period to take a difference against!

This may not be a problem for supply chain planning if they have a frozen forecast zone (since we wouldn't be able to change the forecast one month ahead anyway).

item_fct_int <- item_fcst |>
  arrange(product_id, month) |>
  group_by(product_id) |>
  mutate(
    csum = cumsum(.mean),                             # running total of the decimal forecast
    rolling_int = floor(csum),                        # whole units accumulated so far
    int_fcst = rolling_int - lag(rolling_int, n = 1), # new units crossed this month
    int_fcst = ifelse(is.na(int_fcst), 0, int_fcst)   # first month has no lag, force to zero
  )
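If that zeroed-out first month ever matters, one possible tweak (not used above) is to give lag() a default of 0 instead of patching the NA afterwards, so the first period keeps the floor of its own cumulative sum. A toy comparison with made-up values:

# Hypothetical values: a product with a large first-month forecast
tibble(fc = c(2.3, 0.4, 0.6)) |>
  mutate(
    csum = cumsum(fc),
    rolling_int = floor(csum),
    int_orig = coalesce(rolling_int - lag(rolling_int), 0), # 0 0 1: month one forced to zero
    int_fix  = rolling_int - lag(rolling_int, default = 0)  # 2 0 1: month one keeps its units
  )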

item_fct_int |> 
  mutate(.mean = round(.mean, 2),
         csum = round(csum, 2)) |> 
  select(month, product_id, .mean, csum, rolling_int, int_fcst) |> 
  head(24) |> 
  kable(caption = "Example with two products \n
        .mean = forecast value \n
        csum = cummulative sum of forecast vlues \n
        rolling_in = floor value of csum \n
        int_fcst = our forcast value, that takes rolling_int difference at lag 1",
        booktabs = T)
Example with two products .mean = forecast value csum = cummulative sum of forecast vlues rolling_in = floor value of csum int_fcst = our forcast value, that takes rolling_int difference at lag 1
month     product_id       .mean  csum  rolling_int  int_fcst
2018 Jan  FUR-BO-10000330   0.16  0.16            0         0
2018 Feb  FUR-BO-10000330   0.11  0.27            0         0
2018 Mar  FUR-BO-10000330   0.21  0.48            0         0
2018 Apr  FUR-BO-10000330   0.29  0.77            0         0
2018 May  FUR-BO-10000330   0.30  1.07            1         1
2018 Jun  FUR-BO-10000330   0.37  1.44            1         0
2018 Jul  FUR-BO-10000330   0.23  1.66            1         0
2018 Aug  FUR-BO-10000330   0.19  1.86            1         0
2018 Sep  FUR-BO-10000330   0.39  2.25            2         1
2018 Oct  FUR-BO-10000330   0.33  2.58            2         0
2018 Nov  FUR-BO-10000330   0.48  3.06            3         1
2018 Dec  FUR-BO-10000330   0.62  3.67            3         0
2018 Jan  FUR-BO-10000362   0.34  0.34            0         0
2018 Feb  FUR-BO-10000362   0.29  0.63            0         0
2018 Mar  FUR-BO-10000362   0.39  1.02            1         1
2018 Apr  FUR-BO-10000362   0.47  1.49            1         0
2018 May  FUR-BO-10000362   0.48  1.97            1         0
2018 Jun  FUR-BO-10000362   0.55  2.51            2         1
2018 Jul  FUR-BO-10000362   0.41  2.92            2         0
2018 Aug  FUR-BO-10000362   0.37  3.29            3         1
2018 Sep  FUR-BO-10000362   0.57  3.86            3         0
2018 Oct  FUR-BO-10000362   0.51  4.37            4         1
2018 Nov  FUR-BO-10000362   0.66  5.03            5         1
2018 Dec  FUR-BO-10000362   0.80  5.82            5         0
item_fct_int |>
  group_by(month, category) |>
  summarise(
    dbl_fcst = sum(.mean, na.rm = T),
    int_fcst = sum(int_fcst, na.rm = T),
    .groups = "drop"
  ) |> 
  pivot_longer(cols = c("dbl_fcst", "int_fcst"), names_to = "fcst", values_to = "quantity") |> 
  ggplot(aes(x = month, y = quantity)) +
  geom_line(aes(color = fcst)) +
  theme_minimal() +
  facet_wrap(vars(category), scales = "free") + 
  theme(axis.text.x = element_text(vjust = .5, angle = 45))

It's good practice to review top products. Ideally there is a set process, but here we'll just take a look at the top 9 since it fits nicely in a 3x3 grid. Below is history, our reconciled hierarchical forecast as a decimal (dbl_fcst), and the rolling integer version (int_fcst).

Our integer forecast is aligning pretty nicely with the decimal forecast, but the forecast itself might be going off the rails a little. This would be a good follow-up with the sales team to figure out the root cause of the increased sales.

Recall our histogram back in the data exploration: these high monthly volumes in 2017 are way outside normal sales and clearly unusual.

top9_list <- ttl_ag_wdrops_fit_rec_fc |> as_tibble() |> 
  filter(!is_aggregated(product_id), .model == "ols") |> 
  select(product_id, .mean) |> 
  mutate(product_id = as.character(product_id)) |> 
  group_by(product_id) |> 
  summarize(quantity = sum(.mean), .groups = "drop") |> 
  arrange(desc(quantity)) |> 
  top_n(9, quantity)

a <- ttl_ag_wdrops_fit_rec_fc |> filter(!is_aggregated(product_id), .model == "ols") |> 
  as_tibble() |> 
  mutate(product_id = as.character(product_id),
         quantity = .mean) |> 
  filter(product_id %in% top9_list$product_id) |> 
  select(product_id, month, .model, quantity) |> 
  rename(type = .model) |> 
  mutate(type = "dbl_fcst")

b <- item_fct_int |> 
  filter(product_id %in% top9_list$product_id) |> 
  mutate(type = "int_fcst") |> 
  select(product_id, month, type, int_fcst) |> 
  rename(quantity = int_fcst)

c <- data2_ts_monthly |> as_tibble() |> 
  mutate(type = "history") |> 
  filter(product_id %in% top9_list$product_id) |> 
  select(product_id, month, type, quantity)

asdf2 <- rbind(a,b, c) 

for (target_prod in unique(asdf2$product_id)) {
 p1 <-  asdf2 |> 
   filter(product_id == target_prod) |> 
    ggplot(aes(x = month, y = quantity, color = type)) +
    geom_line() +
    theme_minimal() +
  labs(title = paste0(target_prod, " Forecast"))
  
  print(p1)
}